A Robust Approach for Continuous Interactive Actor-Critic Algorithms
Abstract
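Reinforcement learning refers to a machine learning paradigm in which an agent interacts with the environment to learn how to perform a task. The characteristics of the environment may change over time or be affected by uncontrolled disturbances, preventing the agent from finding a proper policy. Some approaches attempt to address these problems, such as interactive reinforcement learning, where an external entity helps the agent learn through advice. Other approaches, such as robust reinforcement learning, allow the agent to learn the task while acting in a disturbed environment. In this paper, we propose an approach that addresses interactive reinforcement learning problems in a dynamic environment, where advice provides information on the task and on the dynamics of the environment. Thus, an agent learns a policy in a disturbed environment while receiving advice. We implement our approach in a dynamic version of the cart-pole balancing task and in a simulated robotic arm dynamic environment to organize objects. Our results show that the proposed approach allows an agent to complete the task satisfactorily in a dynamic, continuous state-action domain. Moreover, experimental results suggest that agents trained with our approach are less sensitive to changes in the characteristics of the environment than interactive reinforcement learning agents.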
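The interaction pattern described in the abstract, an agent acting under a disturbance while an external advisor occasionally overrides its action, can be sketched as follows. This is a minimal illustration and not the paper's method: the 1-D dynamics, the proportional controller standing in for the actor, the advisor that knows the disturbance, and every name (`agent_policy`, `advisor_policy`, `run_episode`, `advice_prob`) are assumptions made for the example, and no learning takes place.

```python
import random


def clip(x, lo=-2.0, hi=2.0):
    """Keep a continuous action inside a bounded range."""
    return max(lo, min(hi, x))


def agent_policy(state, gain):
    # Hypothetical stand-in for the actor: a proportional controller
    # that knows nothing about the disturbance.
    return clip(-gain * state)


def advisor_policy(state, disturbance, gain):
    # Hypothetical advisor: it knows the disturbance and cancels it,
    # i.e. its advice carries information about the dynamics.
    return clip(-gain * state - disturbance)


def run_episode(advice_prob, disturbance=0.5, gain=1.0, steps=200, seed=0):
    """Run one episode; return the final state and the number of advised steps."""
    rng = random.Random(seed)
    state = 1.0
    advised = 0
    for _ in range(steps):
        action = agent_policy(state, gain)
        # With probability advice_prob the advisor's action replaces the agent's.
        if rng.random() < advice_prob:
            action = advisor_policy(state, disturbance, gain)
            advised += 1
        # Toy disturbed dynamics: the disturbance is an uncontrolled push.
        state = state + 0.1 * (action + disturbance)
    return state, advised


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        final_state, advised = run_episode(p)
        print(f"advice_prob={p}: final state {final_state:.4f}, advised steps {advised}")
```

With `advice_prob=0.0` the unadvised controller settles at an offset of `disturbance / gain` instead of at zero, while with `advice_prob=1.0` the state is driven toward zero. In the paper's setting the advised actions would also feed the agent's learning update, which this sketch leaves out.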
Similar resources
Actor-critic algorithms
We propose and analyze a class of actor-critic algorithms for simulation-based optimization of a Markov decision process over a parameterized family of randomized stationary policies. These are two-time-scale algorithms in which the critic uses TD learning with a linear approximation architecture and the actor is updated in an approximate gradient direction based on information provided by the ...
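The critic/actor split described in this abstract can be illustrated with a small sketch. It is an assumption-laden simplification, not the algorithm analyzed in that paper: it uses a discounted TD(0) state-value critic over one-hot features (a trivial linear architecture), a softmax actor, constant step sizes with the critic's larger than the actor's as a rough stand-in for two time scales, and a hypothetical two-state toy MDP.

```python
import math
import random

N_STATES, N_ACTIONS = 2, 2
GAMMA = 0.9


def step(state, action, rng):
    # Hypothetical toy MDP: action 1 pays reward 1, action 0 pays 0,
    # and the next state is drawn uniformly at random.
    reward = 1.0 if action == 1 else 0.0
    return rng.randrange(N_STATES), reward


def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=5000, critic_lr=0.1, actor_lr=0.01, seed=0):
    rng = random.Random(seed)
    w = [0.0] * N_STATES  # critic weights over one-hot state features
    theta = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # actor preferences
    state = 0
    for _ in range(steps):
        probs = softmax(theta[state])
        action = rng.choices(range(N_ACTIONS), weights=probs)[0]
        next_state, reward = step(state, action, rng)
        # Critic: TD(0) error and update (larger step size, faster time scale).
        td_error = reward + GAMMA * w[next_state] - w[state]
        w[state] += critic_lr * td_error
        # Actor: move along grad log pi(action | state), scaled by the TD error
        # (smaller step size, slower time scale).
        for a in range(N_ACTIONS):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            theta[state][a] += actor_lr * td_error * grad_log_pi
        state = next_state
    return theta, w


if __name__ == "__main__":
    theta, w = train()
    for s in range(N_STATES):
        print(f"state {s}: P(action 1) = {softmax(theta[s])[1]:.3f}, V = {w[s]:.2f}")
```

The TD error acts as the critic's feedback to the actor, so the policy should shift toward the rewarded action as training proceeds.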
Natural actor-critic algorithms
We present four new reinforcement learning algorithms based on actor–critic, natural-gradient and function-approximation ideas, and we provide their convergence proofs. Actor–critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochasti...
Incremental Natural Actor-Critic Algorithms
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods...
Natural-Gradient Actor-Critic Algorithms
We prove the convergence of four new reinforcement learning algorithms based on the actor-critic architecture, on function approximation, and on natural gradients. Reinforcement learning is a class of methods for solving Markov decision processes from sample trajectories under lack of model information. Actor-critic reinforcement learning methods are online approximations to policy iteration in ...
Variance Adjusted Actor Critic Algorithms
We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return. Our critic uses linear function approximation, and we extend the concept of compatible features to the variance-adjusted setting. We present an episodic actor-critic algorithm and show that it converges almost surely to a locally optimal point of the objective function. Index Terms Reinfo...
Journal
Journal title: IEEE Access
Year: 2021
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2021.3099071